2022/09/29

One-factorial design

Research question: Does happiness differ between the Swiss regions?

Data

Hypotheses

Hypothesis 1: The respondents from the 7 regions reported different mean happiness levels.

Hypothesis 2: Respondents from Espace Mittelland reported higher mean happiness levels than Zentralschweiz.

Look at the data - numerical

Nuts2              n    mean  trimmed10  median  sd    var   skew  kurt
Région lémanique   570  2.91  2.91       3       0.92  0.85  0.17  3.58
Espace Mittelland  720  2.72  2.72       3       0.89  0.79  0.36  3.01
Nordwestschweiz    427  2.70  2.70       3       0.88  0.78  0.47  4.05
Zürich             513  2.76  2.76       3       0.97  0.95  0.59  3.93
Ostschweiz         433  2.73  2.73       3       0.95  0.90  0.75  4.56
Zentralschweiz     275  2.61  2.61       3       0.85  0.73  0.62  4.00
Ticino             160  2.91  2.91       3       0.98  0.95  1.16  6.14
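
A table of this kind can be built in base R. The following is a minimal sketch on simulated data, not the survey itself: `df_sim`, the three group labels, the 1-6 response scale, and the helpers `skew`, `kurt` and `describe_group` are all hypothetical. It also assumes that trimmed10 means a 10% trimmed mean and that kurt is the non-excess kurtosis (a normal distribution scores 3).

```r
# Simulated stand-in for df_1f (hypothetical data, assumed 1-6 response scale)
set.seed(1)
df_sim <- data.frame(
  Nuts2 = factor(rep(c("A", "B", "C"), times = c(50, 60, 40))),
  H1    = sample(1:6, 150, replace = TRUE,
                 prob = c(.05, .40, .35, .12, .05, .03))
)

# Moment-based skewness and non-excess kurtosis (assumed definitions)
skew <- function(x) { d <- x - mean(x); mean(d^3) / mean(d^2)^1.5 }
kurt <- function(x) { d <- x - mean(x); mean(d^4) / mean(d^2)^2 }

# One row of descriptives for a numeric vector
describe_group <- function(x) {
  c(n = length(x), mean = mean(x), trimmed10 = mean(x, trim = 0.10),
    median = median(x), sd = sd(x), var = var(x),
    skew = skew(x), kurt = kurt(x))
}

# Split the outcome by region and stack the per-group rows
desc <- t(sapply(split(df_sim$H1, df_sim$Nuts2), describe_group))
print(round(desc, 2))
```

With the real data, `df_sim` would be replaced by `df_1f`.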

Look at the data - graphical

Box plots are excellent to display distributions.
Why are they not a good choice in this case?

Look at the data - graphical

WARNING: Depending on the bin width, histograms can be misleading.
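
The effect of the bin width can be checked without drawing anything. This is a small sketch on simulated ratings (hypothetical data on an assumed 1-6 scale): the same values are binned once with one bin per response category and once with bins twice as wide.

```r
set.seed(2)
# Hypothetical discrete ratings on an assumed 1-6 scale
x <- sample(1:6, 500, replace = TRUE,
            prob = c(.05, .40, .35, .12, .05, .03))

# Bins centred on each integer: one bar per response category
h_fine <- hist(x, breaks = seq(0.5, 6.5, by = 1), plot = FALSE)

# Bins twice as wide merge neighbouring categories
h_coarse <- hist(x, breaks = seq(0.5, 6.5, by = 2), plot = FALSE)

print(h_fine$counts)
print(h_coarse$counts)
```

Dropping `plot = FALSE` draws the two histograms, which makes the change in apparent shape visible.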

Look at the data - graphical

Quantile-quantile plots (Q-Q plots) are a great way to compare the sample distribution with a theoretical distribution. Ideally, the points fall on the line.

Why do we see a stair pattern?

Look at the data - graphical

Add some random noise (normal [0, 0.5])
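
A sketch of the jittering step, on simulated data and under two assumptions: the responses are discrete (here a hypothetical 1-6 scale), and 0.5 is the standard deviation of the noise rather than its variance. Discrete responses produce many tied values, which would show up as steps in a Q-Q plot; the noise breaks the ties.

```r
set.seed(3)
# Hypothetical discrete ratings on an assumed 1-6 scale
x <- sample(1:6, 300, replace = TRUE,
            prob = c(.05, .40, .35, .12, .05, .03))

# Add normal noise; sd = 0.5 is an assumption about "normal [0, 0.5]"
x_jit <- x + rnorm(length(x), mean = 0, sd = 0.5)

# Q-Q coordinates without plotting (theoretical = x, sample = y)
qq_raw <- qqnorm(x, plot.it = FALSE)
qq_jit <- qqnorm(x_jit, plot.it = FALSE)

# Raw data: at most 6 distinct sample quantiles, hence the steps
print(length(unique(qq_raw$y)))
# Jittered data: ties are broken, so the steps disappear
print(length(unique(qq_jit$y)))
```

To draw the plot, call `qqnorm(x_jit)` followed by `qqline(x_jit)`. The noise is only a display aid; the analysis itself should use the unjittered values.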

Analysis - parametric

Omnibus

oneway.test(H1 ~ Nuts2, var.equal = FALSE, data = df_1f)
## 
##  One-way analysis of means (not assuming equal variances)
## 
## data:  H1 and Nuts2
## F = 5.2507, num df = 6.0, denom df = 1030.9, p-value = 2.434e-05

Levine and Hullett (2002) recommend ω² or η² as effect sizes for ANOVAs.

  • partial η² (used by SPSS) depends strongly on the variability of the residuals
  • η² is biased, e.g. when n is small or there are many levels
aov(H1 ~ Nuts2, data = df_1f) %>% effectsize::omega_squared(verbose = FALSE) %>% toTable()
Parameter  Omega2     CI    CI_low    CI_high
Nuts2      0.0079776  0.95  0.002278  1
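
For a one-way design, ω² can also be worked out by hand from the ANOVA table, which shows how it corrects η² downwards. The sketch below uses simulated data (`df_sim` and its group means are hypothetical) and the textbook formula ω² = (SS_between - df_between · MS_within) / (SS_total + MS_within). It is meant as an illustration of the quantity, not as a description of what `effectsize` does internally.

```r
set.seed(4)
# Hypothetical data: three groups with slightly different means
df_sim <- data.frame(
  Nuts2 = factor(rep(c("A", "B", "C"), times = c(50, 60, 40))),
  H1    = c(rnorm(50, 2.9, 0.9), rnorm(60, 2.7, 0.9), rnorm(40, 2.6, 0.9))
)

# Classical one-way ANOVA table (equal variances assumed)
tab  <- anova(lm(H1 ~ Nuts2, data = df_sim))
ss_b <- tab["Nuts2", "Sum Sq"]       # between-group sum of squares
ss_w <- tab["Residuals", "Sum Sq"]   # within-group sum of squares
df_b <- tab["Nuts2", "Df"]
ms_w <- tab["Residuals", "Mean Sq"]

eta2   <- ss_b / (ss_b + ss_w)
omega2 <- (ss_b - df_b * ms_w) / (ss_b + ss_w + ms_w)

print(c(eta2 = eta2, omega2 = omega2))
```

ω² is always smaller than η² here, because the numerator subtracts the part of the between-group variation that is expected from noise alone.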

Hypothesis 1: The respondents from the 7 regions reported different mean happiness levels. → The null hypothesis can be rejected, but the effect is minimal.

Analysis - parametric

Contrasts

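f1_lm <- lm(H1 ~ Nuts2, data = df_1f)
f1_emm <- emmeans::emmeans(f1_lm, 'Nuts2', data = df_1f)
emmeans::test(
  emmeans::contrast(
    f1_emm,
    # weights follow the order of the Nuts2 levels (assumed to match the table above):
    # +1 = Espace Mittelland, -1 = Zentralschweiz (Hypothesis 2)
    list(ac1 = c(0, 1, 0, 0, 0, -1, 0)) # this list can contain multiple contrasts
  ),
  adjust = 'none'
)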
##  contrast estimate     SE   df t.ratio p.value
##  ac1         0.114 0.0651 3091   1.753  0.0797
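
The columns of this output can be reproduced by hand, which makes clear what the contrast tests. The following sketch uses simulated data with three hypothetical groups (`df_sim`, weights `w`): the estimate is the weighted sum of group means, the standard error uses the pooled residual SD of the linear model, and df is the residual df. It shows the standard formulas for a one-way linear model and is not taken from the emmeans source. Because Hypothesis 2 is directional, the sketch also computes a one-sided p-value next to the two-sided one.

```r
set.seed(5)
# Hypothetical data: three groups instead of the seven regions
df_sim <- data.frame(
  Nuts2 = factor(rep(c("A", "B", "C"), times = c(50, 60, 40))),
  H1    = c(rnorm(50, 2.9, 0.9), rnorm(60, 2.7, 0.9), rnorm(40, 2.6, 0.9))
)
w <- c(0, 1, -1)  # contrast weights: group B minus group C

fit <- lm(H1 ~ Nuts2, data = df_sim)
m   <- tapply(df_sim$H1, df_sim$Nuts2, mean)    # group means
n   <- tapply(df_sim$H1, df_sim$Nuts2, length)  # group sizes

est     <- sum(w * m)                        # contrast estimate
se      <- sigma(fit) * sqrt(sum(w^2 / n))   # pooled-SD standard error
df_res  <- df.residual(fit)                  # N minus number of groups
t_ratio <- est / se

p_two <- 2 * pt(abs(t_ratio), df_res, lower.tail = FALSE)  # two-sided
p_one <- pt(t_ratio, df_res, lower.tail = FALSE)           # one-sided, B > C

print(c(estimate = est, SE = se, df = df_res,
        t.ratio = t_ratio, p.two = p_two, p.one = p_one))
```

When the estimate is positive, the one-sided p-value is half the two-sided one.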